Computer systems have enjoyed exponential growth in processor speed for the last 20 years, while DRAM main-memory speed has improved only modestly. Today a cache miss to main memory costs hundreds of processor cycles, and this gap between processor and DRAM speeds continues to widen exponentially. Recent database performance studies report that 50% or more of the execution time of database systems operating on memory-resident data is often wasted on cache misses. By contrast, several decades of database research have produced many I/O techniques for tolerating the disk-to-memory gap. As ever-larger main memories in server systems hold more and more of a database's hot data, the memory-to-cache gap is becoming the dominant performance problem for database systems.
In this talk, I rethink database system design to exploit modern computer architecture features. In particular, I exploit the CPU cache hierarchy and the software prefetch instructions supported by most modern processors for relational database systems. Cache prefetching loads data into the cache before it is used, thereby reducing the impact of cache misses. I focus on important structures and algorithms with intriguing, non-contiguous memory access patterns: the B+-tree index and the hash join algorithm. I will present novel prefetching techniques for B+-trees and hash joins, and I will report on the results of extensive performance studies demonstrating the dramatic performance improvements of the prefetching techniques over traditional algorithms and other cache-friendly schemes.